Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boost dandischema to 0.9.* series so we get support for pydantic 2.0 and schema 0.6.5 #1823

Merged
merged 7 commits into from
Feb 21, 2024

Conversation

yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented Jan 23, 2024

Closes #1820

TODOs

@yarikoptic
Copy link
Member Author

there seems to be some changes in behavior with the new dandischema to address , some look like

E       AssertionError: assert [{'field': 'a...publishable'}] == [{'field': 'a...publishable'}]
1296
E         At index 0 diff: {'field': 'assetsSummary', 'message': 'Value error, A Dandiset containing no files or zero bytes is not publishable'} != {'field': 'assetsSummary', 'message': 'A Dandiset containing no files or zero bytes is not publishable'}
1297
E         Full diff:
1298
E           [
1299
E            {'field': 'assetsSummary',
1300
E         -   'message': 'A Dandiset containing no files or zero bytes is not publishable'},
1301
E         ?                                                                   ----------- --
1302
E         +   'message': 'Value error, A Dandiset containing no files or zero bytes is not '...
1303
E         
1304
E         ...Full output truncated (3 lines hidden), use '-vv' to show

@candleindark does it ring a bell may be right away?

@candleindark
Copy link
Member

@candleindark does it ring a bell may be right away?

Yes. Pydantic V2 constructs error objects, including error messages, differently in comparison to Pydantic V1. This particular error is resulting from the failure to pass the validator for the assetsSummary field in the PublishedDandiset model.

To adapt to the changes in error generations brought by Pydantic V2, I made changes in the tests in dandischema. Specifically, I made a change for the changed error message from the aforementioned error at https://github.com/dandi/dandi-schema/blob/dbd704ace5b2abf5fb2718cca427c8fff76dcb62/dandischema/tests/test_models.py#L422-L423.

You may have to adjust some of the tests in dandi-archive to adapt to the new error generations in Pydantic V2 if these tests assess errors originate from Pydantic.

@jwodder
Copy link
Member

jwodder commented Jan 23, 2024

Some of the dandi-archive code should also be updated to no longer use now-deprecated dandischema methods; see the warnings emitted at the end of the tests.

@yarikoptic
Copy link
Member Author

@candleindark could you take over this PR and adjust those exception msgs, and see to @jwodder's comment about deprecated invocations, etc?

@candleindark
Copy link
Member

@candleindark could you take over this PR and adjust those exception msgs, and see to @jwodder's comment about deprecated invocations, etc?

OK

@candleindark
Copy link
Member

@yarikoptic @jwodder I have solved the two failures that are related to the new format of error messages (to be pushed). There are four failures remaining. They are all related. I am suspecting the cause of the remaining failures is that in POSTing to create a new dandiset with metadata, the metadata is not written to the database. I am not familiar with Django and how a Django app accesses and organizes the DB, and I having trouble to locate the tables in the DB for storing dandisets and related metadata. In fact, there were no table in the result of the following SQL script.

SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;

@candleindark
Copy link
Member

candleindark commented Jan 25, 2024

@yarikoptic @jwodder I have solved the two failures that are related to the new format of error messages (to be pushed). There are four failures remaining. They are all related. I am suspecting the cause of the remaining failures is that in POSTing to create a new dandiset with metadata, the metadata is not written to the database. I am not familiar with Django and how a Django app accesses and organizes the DB, and I having trouble to locate the tables in the DB for storing dandisets and related metadata. In fact, there were no table in the result of the following SQL script.

SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;

I am able to see the tables in running the DB service in "production" mode.

I don't have to execute the steps in https://github.com/dandi/dandi-archive/blob/master/DEVELOPMENT.md#initial-setup for testing, do I?

Do you use SQLite or an equivalent for testing? If you do, where is that DB file?

@yarikoptic
Copy link
Member Author

@dandi/dandiarchive could you please guide us here? or may be @kabilar already got hands dirty enough to know this? (otherwise we will need to try to get into chatting to various GPTs to align with my knowledge of testing django apps ;-) )

@yarikoptic
Copy link
Member Author

Do you use SQLite or an equivalent for testing? If you do, where is that DB file?

I think it is relying on a DB since https://github.com/dandi/dandi-archive/blob/master/.github/workflows/backend-ci.yml creates one. DB would leave in the docker (volume) I guess after compose-up ... if no experts follow up, I will dig it up tomorrow.

@candleindark
Copy link
Member

Do you use SQLite or an equivalent for testing? If you do, where is that DB file?

I think it is relying on a DB since https://github.com/dandi/dandi-archive/blob/master/.github/workflows/backend-ci.yml creates one. DB would leave in the docker (volume) I guess after compose-up ... if no experts follow up, I will dig it up tomorrow.

Yeah, it seems to be a Postgres DB.

I found out more using the following code in a test.

    from django.conf import settings

    databases = settings.DATABASES
    database_engine = settings.DATABASES['default']['ENGINE']

It is a database under the name of test_django with authentication information being the same as the one for development setup.

Additionally, the access to DB in a test is provided by the @pytest.mark.django_db() decorator for the test.

However, for test_dandiset_rest_create, I don't see anything written to the api_daniset table.

@candleindark
Copy link
Member

candleindark commented Jan 26, 2024

Because of the transactional nature of django.db.transaction.atomic() used in the project. To see what's written to the DB through another DB connection in the middle of the test, it seems we need to modify the options to the decorator and use an additional fixture as depicted below.

@pytest.mark.django_db(transaction=True)
def test_dandiset_rest_create(api_client, user, transactional_db):

@candleindark
Copy link
Member

I can now confirm that the metadata was not written correctly to the database in the first place.

The cause of the problem is in

return PydanticDandiset.unvalidated(**version_metadata).json_dict()

I will continue the play tomorrow.

@candleindark
Copy link
Member

candleindark commented Jan 27, 2024

@yarikoptic The remaining failures in tests for the backend are resolved by this PR.

However, for that solution to work, we have to wait for the merge of this PR, dandi/dandi-schema#218, for dandischema and possibly a new version of dandishcema because of it.

@yarikoptic There are failures in the frontend tests as well. Do you have someone else to handle that already?

@waxlamp waxlamp mentioned this pull request Feb 2, 2024
@yarikoptic
Copy link
Member Author

@waxlamp @mvandenburgh - I have adjusted PR description with pointers about json schema changes

@yarikoptic
Copy link
Member Author

@jwodder -- I am concerned about our dandi-cli testing here -- all green sounds too good to be true... looked into a sample run log and it says

collected 711 items / 443 deselected / 268 selected
...
267 passed, 443 deselected

whenever our runs in dandi-cli repo say 708 passed, 1 skipped, 2 xfailed in 724.73s (0:12:04)... so what are will skipping/not testing here and why?

@jwodder
Copy link
Member

jwodder commented Feb 7, 2024

@yarikoptic The CLI integration tests are run with the --dandi-api option, which means that only tests that use the Archive Docker image are run.

@mvandenburgh
Copy link
Member

I pushed two fixes to this branch - one to fix the e2e CI test failures (cd5cd04), and the other to fix the Meditor showing a tab for "Schema Key" (b3db231). I included further explanations in the commit descriptions.

I did some testing locally and things appear to be working after those fixes. I think we should still do extensive testing in staging before cutting a new release.

@yarikoptic
Copy link
Member Author

Great, thank you @mvandenburgh !
I would be all for it to merge and test in staging. But then we must be aware that master has this "experimental" feature ATM so we do not rush to tag for release to expedite some fix.

@yarikoptic
Copy link
Member Author

And who would be to review & potentially approve? meanwhile I will request multiple candidates.

@yarikoptic
Copy link
Member Author

Thank you @mvandenburgh ! Now we need I guess review and a blessing/merge by @waxlamp ?

@yarikoptic
Copy link
Member Author

So, @waxlamp should we merge and test it in staging?

@yarikoptic
Copy link
Member Author

note: that downstream developers (@magland and @alejoe91) are waiting for dandi to switch to pydantic 2.0 since they already did and now want to use dandi python module.

yarikoptic and others added 7 commits February 21, 2024 11:48
 The validator for the `assetsSummary` field in the
 `PublishedDandiset` model in dandischema 0.9.0
 generates an error with a slightly different message
 The validator for the `digest` field in the
 `PublishedAsset` model in dandischema 0.9.0
 generates an error with a slightly different message
 than before
This seems to be a breaking change in pydantics json schema export function.
Pydantic 2.0's JSON schema export no longer specifies a `type` field for
the `schemaKey` property, and leaves it undefined instead. This change
makes it so an undefined type also delegates it as a "basic schema",
meaning it will not be rendered as its own tab in the meditor.
Copy link
Member

@mvandenburgh mvandenburgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebased this after merging #1865 , looks like everything is passing as expected 👍

@mvandenburgh mvandenburgh merged commit 5b6c66a into master Feb 21, 2024
11 checks passed
@mvandenburgh mvandenburgh deleted the rf-new-dandischema branch February 21, 2024 17:41
@dandibot
Copy link
Member

🚀 PR was released in v0.3.77 🚀

@dandibot dandibot added the released This issue/pull request has been released. label Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
released This issue/pull request has been released.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support dandischema 0.9.0 and schema version 0.6.5
5 participants